BACKGROUND


Ultrasound 


The use of pulsed ultrasound to identify metals with unacceptable concentrations of impurities is a novel
nondestructive technique. Impurities in metal, such as sulfur stringers, may directly affect the frequency response
of an ultrasonic signal as it travels through the material. The impact that the impurities have on the spectrum
may not be known, only that certain frequency components may attenuate while others may resonate. However,
the complex, noisy, and inconsistent nature of the signals obtained using this technique makes interpretation of the
results difficult.

Neural  Networks 


Traditional  neural  networks  are  inspired  by  biological  models,  but  bear  little  resemblance  to  their 
biological  counterparts.  Recent  advances  in  network  design  have  made  the  neural  network  an  attractive 
alternative  to  conventional  pattern  recognition  techniques.  A  neural  network  is  a  collection  of  interconnected 
neuromimes.  A  neuromime  (Figure  1)  is  an  exemplification  in  electronic  circuitry  of  an  abstraction  of  a  living 
neuron.  The  neuron  is  seen  as  an  electronic  device  with  all  the  communication  between  neurons  via  synapses. 
The  synapse  multiplies  the  information  passing  through  it  by  a  scalar  (called  a  weight).  The  input  to  a 
neuromime  is  given  as 


$$x_j = \sum_i w_{ji}\, o_i$$  (1)

where

$x_j$ = net input to current neuron

$w_{ji}$ = weight from previous layer to current layer

$o_i$ = output or activity of a neuron at previous layer or input to layer

The neuromime acts as an integrator with a nonlinear output. A neural network represents the flow of
information from input to output. In biological neurons there is a flow of ionic currents, but in neuromimes
(neurons, nodes) the activity is represented by a scalar, which is visualized as the instantaneous frequency of a
pulse generator producing a pulse position modulated signal. This frequency is sigmoidal with a finite upper
asymptote. Rumelhart et al. (ref 1) suggest the following logistic model for this activity:


$$o_j = \frac{1}{1 + e^{-(x_j + \theta_j)}}$$  (2)

where

$o_j$ = output or activity of a neuron at current layer

$x_j$ = net input to current layer

$\theta_j$ = bias
The bias provides a threshold for activation. These output scalars pass to other neurons in the network
through the synapses (Equation 1). The positioning of the synapses determines the character of the network.

The  connections  usually  cause  the  neurons  to  fall  into  groups  called  slabs  or  layers.  A  typical  network  consists 
of  a  sequence  of  slabs  either  fully  or  randomly  interconnected  with  no  connections  among  neurons  sharing  a 
common  slab  (Figure  2). 
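The forward flow described by Equations 1 and 2 can be sketched in a few lines of Python. This is an illustrative sketch only, not the simulator used in this work: the function names, the 3 x 2 x 1 network, and all weight values are hypothetical, and the bias is assumed to be added to the net input inside the exponent, as in the logistic form of ref 1.

```python
import math

def logistic(x, theta=0.0):
    """Equation 2: activity of a neuromime with net input x and bias theta."""
    return 1.0 / (1.0 + math.exp(-(x + theta)))

def layer_forward(inputs, weights, biases):
    """Propagate activities through one fully connected slab.

    weights[j][i] is the synapse weight from afferent i to neuron j.
    """
    outputs = []
    for w_row, theta in zip(weights, biases):
        net = sum(w * o for w, o in zip(w_row, inputs))  # Equation 1
        outputs.append(logistic(net, theta))             # Equation 2
    return outputs

def network_forward(inputs, layers):
    """layers is a list of (weights, biases) pairs, one pair per slab."""
    activity = inputs
    for weights, biases in layers:
        activity = layer_forward(activity, weights, biases)
    return activity

# Hypothetical 3 x 2 x 1 network with made-up weights and biases.
hidden = ([[0.5, -0.4, 0.1], [0.3, 0.8, -0.6]], [0.0, 0.1])
output = ([[1.2, -0.7]], [-0.2])
result = network_forward([1.0, 0.0, 1.0], [hidden, output])
print(result)
```

Because no neuron reads from another neuron in its own slab, each slab can be evaluated as one independent loop over its neurons.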

Neural  networks  are  dynamical  systems  governed  by  rules  that  transform  the  network  from  one  state  to 
another.  They  consist  of  a  series  of  interconnected  nodes,  with  activation  values  assigned  to  each  node  and 
weights  assigned  to  the  connections.  Information  is  stored  in  these  weights  using  a  learning  algorithm  to  modify 
the  interconnects.  The  nature  of  the  learning  algorithm  is  such  that  information  is  distributed  in  parallel  among 
all  nodes  in  the  network  as  the  network  converges  to  the  desired  response.  Convergence  implies  that  the 
network  has  found  regularities  and  correlations  in  the  input  data.  After  proper  training,  the  network  responds 
only  to  these  trends  in  the  data,  making  it  highly  insensitive  to  noisy  and  incomplete  data. 

A popular training algorithm (ref 1) involves presenting feature vectors at the input nodes and target
vectors at the output nodes of a multilayer network and modifying interconnects to implement a least squares
gradient descent (Equation 3). True gradient descent requires the learning rate $\eta$ to be infinitesimally small;
however, practical applications require a finite value that can lead to oscillations in $w_{ji}$. A momentum term is
often added to minimize these oscillations while allowing a larger learning rate.


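$$\Delta w_{ji}(n+1) = \eta\,(\delta_j o_i) + \alpha\,\Delta w_{ji}(n)$$  (3)

where

$\Delta w_{ji}$ = weight change

$\eta$ = learn rate

$\delta_j$ = error contribution of current neuron

$o_i$ = afferent

$\alpha\,\Delta w_{ji}(n)$ = momentum (the fraction $\alpha$ of the previous weight change)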
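The effect of the momentum term in Equation 3 can be seen in a small numerical sketch. The error signals below are hypothetical and are held fixed rather than computed from a network; they only illustrate how the previous weight change damps alternating updates and reinforces consistent ones.

```python
def weight_change(eta, delta_j, o_i, alpha, previous_change):
    """Equation 3: learning-rate term plus momentum term."""
    return eta * delta_j * o_i + alpha * previous_change

def run(deltas, alpha, eta=0.5, o_i=1.0):
    """Apply Equation 3 to a sequence of error signals; return each weight change."""
    change, history = 0.0, []
    for d in deltas:
        change = weight_change(eta, d, o_i, alpha, change)
        history.append(change)
    return history

# Hypothetical error signals that flip sign on every step, as can happen
# when a finite learning rate overshoots a minimum.
oscillating = [0.4, -0.4, 0.4, -0.4, 0.4, -0.4]
plain = run(oscillating, alpha=0.0)
damped = run(oscillating, alpha=0.5)

# Hypothetical error signals that keep the same sign (a long, steady slope).
steady = [0.4] * 6
slow = run(steady, alpha=0.0)
fast = run(steady, alpha=0.5)

print(plain, damped)
print(slow, fast)
```

With alternating signals the momentum term shrinks the swings; with same-sign signals it builds up the step, which is why it acts like a larger effective learning rate.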

As might be expected, the error contribution $\delta_j$ for output units is proportional to the difference between
the desired results (targets) and the current output (Equation 4). For hidden units, $\delta_j$ is
proportional to the contribution each unit makes to the errors in the next slab (Equation 5).


$$\delta_j = (t_j - o_j)\, o_j (1 - o_j)$$  (4)

where

$\delta_j$ = error term of output unit

$t_j$ = target or desired output

$o_j$ = current output
$$\delta_j = o_j (1 - o_j) \sum_k \delta_k w_{kj}$$  (5)

where

$\delta_j$ = error term of hidden unit

$o_j$ = current output

$\sum_k \delta_k w_{kj}$ = contribution of current unit to error in next layer
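Equations 4 and 5 can be checked with a short worked example. The sketch below assumes one hidden unit feeding two output units; the targets, activities, and weights are made-up values chosen only to make the arithmetic easy to follow.

```python
def output_delta(target, output):
    """Equation 4: error term for an output unit."""
    return (target - output) * output * (1.0 - output)

def hidden_delta(output, next_deltas, next_weights):
    """Equation 5: error term for a hidden unit j.

    next_weights[k] is w_kj, the weight from this hidden unit to unit k
    in the next slab; next_deltas[k] is that unit's error term.
    """
    back = sum(d * w for d, w in zip(next_deltas, next_weights))
    return output * (1.0 - output) * back

# Hypothetical targets and current activities of two output units.
targets = [1.0, 0.0]
outputs = [0.8, 0.3]
d_out = [output_delta(t, o) for t, o in zip(targets, outputs)]

# Hypothetical hidden unit with activity 0.6 and weights to the two outputs.
w_to_outputs = [0.5, -1.0]
d_hid = hidden_delta(0.6, d_out, w_to_outputs)
print(d_out, d_hid)
```

The factor o(1 - o) in both equations is the slope of the logistic function, so units whose activity is near 0 or 1 receive small error terms.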


The  number  of  units  in  each  layer  is  most  often  empirically  determined,  and  care  must  be  taken  to  use  a 
large  enough  training  set  to  ensure  patterns  are  not  simply  memorized.  Training  time  can  be  prohibitive  in  many 
applications  so  special  purpose  neural  network  simulators  are  being  developed  to  exploit  the  inherent  parallelism 
of  the  networks. 

A network's performance is often measured in connection updates per second (CUPS). This can be
defined as the rate of synapse modifications during one forward and one backward propagation through the
network. A typical 8 MHz PC-AT is capable of approximately 4K CUPS, while a 25 MHz 80386DX attains
30K CUPS. The network used for training in this experiment is executed on a parallel processing system
composed of 40 transputers exercising a suite of homogeneous processing functions running under an OCCAM
harness (ref 2). OCCAM permits the decomposition of processes into parallel procedures that execute
simultaneously on different transputers, or time shared on a single transputer. These parallel processes can only
communicate via channels, cannot share memory, and only synchronize when communicating. This hierarchical
decomposition into parallel processes allows a suite of neural processes to be executed on a single transputer,
while information is routed between nodes in the network. The problem domain is partitioned into subdomains
across the transputer network. The transputers are configured as a one-dimensional hardware mesh topology.
More complicated array structures are possible that minimize communication traffic and introduce disjoint paths
for fault tolerance. However, communication protocol is greatly simplified by using only two of the four
available hardware links. In addition, the granularity of the network can easily be adjusted without reconfiguring
these links in order to optimize the computation/communication tradeoffs and achieve a satisfactory load balance.
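To give a feel for what these CUPS figures mean in practice, the following sketch estimates the time for one pass over a training set. It rests on several assumptions that are not stated in the text: bias terms are not counted as connections, every connection is updated once per training pattern, and "K" means 1000. The layer sizes and pattern count are those reported in the RESULTS section.

```python
def connections(layer_sizes):
    """Number of synapses in a fully interconnected layered network."""
    return sum(a * b for a, b in zip(layer_sizes, layer_sizes[1:]))

def seconds_per_epoch(layer_sizes, n_patterns, cups):
    """Time for one pass over the training set at a given CUPS rate."""
    return connections(layer_sizes) * n_patterns / cups

sizes = [100, 20, 2]   # network reported in RESULTS
patterns = 300         # approximate number of feature vectors in RESULTS

# CUPS rates quoted above (K taken as 1000, an assumption).
rates = {"8 MHz PC-AT": 4e3, "25 MHz 80386DX": 30e3}

for machine, cups in rates.items():
    print(machine, seconds_per_epoch(sizes, patterns, cups), "s per epoch")
```

Under these assumptions the network has 2040 connections, so one epoch takes about 153 s at 4K CUPS and about 20 s at 30K CUPS. Training typically needs many epochs, which is the motivation for the parallel implementations.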

Two  algorithms  were  studied  to  address  the  computation/communication  problem.  The  first  partitions 
the  network  among  the  transputers  so  that  activation  and  synapse  values  are  maintained  at  each  node.  On  a 
forward  or  backward  pass,  values  are  computed  using  partial  data  as  it  becomes  available.  In  this  configuration, 
communication  overhead  is  reduced,  but  transputers  often  sit  idle  waiting  for  data.  In  addition,  the  size  of  the 
network  is  restricted  by  the  memory  limitations  at  the  nodes.  Only  400K  CUPS  was  attained  using  this 
technique  because  of  practical  limitations  on  the  number  of  transputers  used. 

An alternative approach uses data partitioning (ref 3) to achieve faster throughput during training. For a
small $\eta$, it can be assumed that $\Delta w_{ji}$ is small compared to $w_{ji}$, so many iterations can be run
before updating $w_{ji}$. Consequently, several simulations can be performed in parallel on different training
patterns. Therefore, instead of distributing the network among the transputers, the same network is implemented
on each transputer using a different set of training patterns. On a backward pass, each transputer computes
$\Delta w_{ji}$ for its training set and broadcasts it to the host, where $\Delta w_{ji}$ is accumulated and $w_{ji}$
updated. The host processor transmits these new weights to each processor, and the process repeats until the
desired error is achieved. 3.5M CUPS was achieved using this technique; however, communication overhead limits
the practical use of this approach to large networks with a large training set. In addition, stability problems may
arise if the learning rate is too high (> 0.5).
APPROACH


Metals  with  similar  composition  but  different  impurity  concentrations  were  separated  into  two  groups 
and  machined  from  bar  stock.  Figure  3  shows  typical  impurity  concentrations  of  the  samples  in  each  of  the  two 
groups.  Since  grain  structure  may  affect  the  spectral  signature,  samples  from  each  group  were  heat  treated  to 
erase  coarse  multiphase  microstructure.  Figure  4  shows  the  different  grain  structures  in  each  of  the  groups  prior 
to  heat  treatment.  Figure  5  shows  the  grain  structure  resulting  from  heat  treatment. 

Figure  6  shows  the  laboratory  setup  that  is  used  to  collect  ultrasonic  signatures  from  the  test  samples. 

A Panametrics Pulser Receiver, Model 5052PR, transmits a series of 10 MHz ultrasonic pulses through a
piezoelectric transducer coupled to the sample under test. The pulses travel through the metal and are reflected
back to the transducer and received by the pulser. As the pulses travel through the sample, the nature of the
metal determines which frequency components resonate or attenuate. A Panametrics Stepless Gate, Model
5052G, gates only the first echo received by the pulser to a Tektronix 497P spectrum analyzer. The spectrum
analyzer extracts the frequency spectrum of the echoes and transmits this information through an IEEE-488 bus
to the PC for analysis. A neural network is then employed to identify the salient features of these signals and
produce output relating to the quality of the sample being tested.
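In the setup described here, gating and spectrum extraction are done in hardware. For readers who want to experiment with the idea on digitized data, the same two steps can be imitated in software. The sketch below is an assumption-laden stand-in, not a model of the instruments: the record is synthetic, the gate position is chosen by hand, and a direct discrete Fourier transform replaces the spectrum analyzer.

```python
import cmath
import math

def gate(samples, start, length):
    """Keep only the window that contains the echo of interest."""
    return samples[start:start + length]

def magnitude_spectrum(window):
    """Magnitude of the discrete Fourier transform for bins 0 .. n/2 - 1."""
    n = len(window)
    spectrum = []
    for k in range(n // 2):
        acc = sum(window[t] * cmath.exp(-2j * math.pi * k * t / n)
                  for t in range(n))
        spectrum.append(abs(acc))
    return spectrum

# Synthetic record: silence, a 64-sample "echo" with a tone in bin 8, silence.
n = 64
echo = [math.sin(2 * math.pi * 8 * t / n) for t in range(n)]
record = [0.0] * 32 + echo + [0.0] * 32

spectrum = magnitude_spectrum(gate(record, 32, n))
peak_bin = max(range(len(spectrum)), key=spectrum.__getitem__)
print(peak_bin)
```

A real echo would spread energy over many bins, and it is the shape of that spread that the network is asked to interpret.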

RESULTS 

Spectral data were collected at different locations on each specimen from a selection of samples for
each metal group. Approximately 300 feature vectors, each composed of 100 normalized spectral components, were
extracted from the data. A 100 x 20 x 2 network was trained on these data with activity of the two output
neurons used to classify the metals into one of the two groups. The network converged to a solution using the
salient features of the spectral signature to classify the samples from each group. The network was trained on
the transputers and then ported to the PC where the input feature vectors can be obtained from the spectral
components in real time. The system was tested on the training samples, as well as the remaining bar stock, and
successfully classified the specimens from each group.
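The text does not specify how the spectral components were normalized or how the two output activities were turned into a decision. As one plausible reading, the sketch below assumes min-max scaling to the range 0 to 1 and a winner-take-all rule on the output neurons; both choices, along with the function names and sample values, are assumptions made for illustration.

```python
def normalize(spectrum):
    """Scale spectral components to [0, 1] (assumed min-max normalization)."""
    lo, hi = min(spectrum), max(spectrum)
    if hi == lo:
        return [0.0] * len(spectrum)
    return [(s - lo) / (hi - lo) for s in spectrum]

def classify(output_activities, labels=("group 1", "group 2")):
    """Pick the group whose output neuron is most active (assumed rule)."""
    winner = max(range(len(output_activities)),
                 key=output_activities.__getitem__)
    return labels[winner]

# Hypothetical four-component spectrum and hypothetical output activities.
features = normalize([3.0, 9.0, 6.0, 3.0])
decision = classify([0.91, 0.07])
print(features, decision)
```

In the system described above, the feature vector would have 100 components and the two activities would come from the output slab of the trained 100 x 20 x 2 network.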

This  technique  may  be  expanded  to  identify  a  range  of  impurity  concentrations  (per  ASTM  E-45-87, 
"Standard  Practice  for  Determining  the  Inclusion  Content  of  Steel")  by  employing  a  different  network 
architecture.  Complex  geometries  found  in  practice  will  likely  require  larger  networks.  Self-organizing  systems 
that  model  the  probability  distribution  of  the  training  data  are  being  studied  to  reduce  network  size. 
Figure 1. Typical Neuromime


Figure 2. Three Layer Network (input buffer, hidden layer, output buffer)


Figure 3. Typical impurity concentrations of the sample groups.
Figure 5. Resulting grain structure after heat treatment.


Figure  6.  Laboratory  setup. 